Abstract: The relationships between soil total nitrogen (STN) and influencing factors are scale-dependent.
The objective of this study was to identify the multi-scale spatial relationships of STN with selected 
environmental factors (elevation, slope and topographic wetness index), intrinsic soil factors (soil bulk 
density, sand content, silt content, and clay content) and combined environmental factors (including the first 
two principal components (PC1 and PC2) of the Vis-NIR soil spectra) along three sampling transects 
located at the upstream, midstream and downstream of Taiyuan Basin on the Chinese Loess Plateau. We 
separated the multivariate data series of STN and influencing factors at each transect into six intrinsic mode 
functions (IMFs) and one residue by multivariate empirical mode decomposition (MEMD). Meanwhile, we 
obtained the predicted equations of STN based on MEMD by stepwise multiple linear regression (SMLR). 
The results indicated that the dominant scales of explained variance in STN were at scale 995 m for transect 
1, at scales 956 and 8852 m for transect 2, and at scales 972, 5716 and 12,317 m for transect 3. Multi-scale 
correlation coefficients between STN and influencing factors were less significant in transect 3 than in 
transects 1 and 2. The goodness-of-fit indices root mean square error (RMSE), normalized root mean square error
(NRMSE) and coefficient of determination (R²) indicated that the prediction of STN at the sampling scale
by summing all of the predicted IMFs and residue was more accurate than that by SMLR directly. Therefore, 
the multi-scale method of MEMD has a good potential in characterizing the multi-scale spatial relationships 
between STN and influencing factors at the basin landscape scale. 
1 Introduction


Soil total nitrogen (STN) is a key indicator of soil fertility and quality, and is closely related to land 
productivity (Franzluebbers and Stuedemann, 2009). As an important part of STN, available 
nitrogen supplies the necessary macro nutrient for plant growth and leaf photosynthesis (Boussadia
et al., 2010). However, excessive nitrogen content in soil is one of the largest contributors of 
non-point pollution, such as soil acidification, eutrophication, water-quality problems, and even 
gaseous emissions (Carpenter et al., 1998; Rode et al., 2009; Velthof et al., 2014). 

The distribution of STN exhibits spatial variability, which is caused by environmental factors 
including climate, topography, soil parent material, land use and human activity (Liu et al., 2008; 
Basso et al., 2016). In addition, the spatial variability of STN is scale-dependent as different factors 
and processes may operate at different scales and intensities (Hu et al., 2014). Thus, the spatial 
variability of STN is a function of scale (Lin et al., 2005; Wang et al., 2010). However, recent
studies have mainly focused on the inherent spatial variability of STN at a single scale (Momtaz et
al., 2009; Jacksonblake et al., 2012). It is therefore essential to explore the spatial patterns of STN, and
efficient strategies for predicting its distribution, in the same region at multiple scales.

Traditional methods, such as Pearson's linear analysis, multi-fractal analysis, wavelet analysis 
and geostatistics, can explore multi-scale spatial relationships between soil properties and 
influencing factors. For example, the multi-scale spatial relationships between soil physical 
properties and saturated hydraulic conductivity were examined by multi-fractal analysis (Zeleke 
and Si, 2005). Zhu et al. (2016) analyzed the scale-specific relationships between soil available 
micronutrients and environmental factors at the Fenhe River Basin on the Chinese Loess Plateau by 
wavelet analysis. These methods assume that the distributions of soil properties and the related
processes (soil genetic processes and the processes by which environmental factors influence soil
properties) are linear (She et al., 2015). However, soil genetic processes that influence the spatial
variability of STN may be nonlinear, because the effects of different processes may not be additive
or follow the principle of superposition (Hu et al., 2013). Multivariate empirical mode
decomposition (MEMD), proposed by Fleureau et al. (2011), is an extension of the empirical mode
decomposition (EMD) algorithm. Unlike the traditional methods, MEMD does not require any prior
assumption about the data. It therefore offers a way to analyze non-stationary and
nonlinear processes.

Taiyuan Basin, located in the central and eastern Chinese Loess Plateau and in the middle of 
Shanxi Province, China, is a typical Cenozoic fault basin. The topography of the basin is 
characterized by a low-lying center and a high rim, and the climate is a typical basin climate
with rainy summers, short springs and autumns, dry and cold winters, abundant sunshine and a large
temperature difference between day and night (Li et al., 2000). Because a thick layer of loess caps
the surface of the basin, the soil is susceptible to serious erosion. In addition, the mosaic land-use
pattern of the basin has made the spatial distribution of STN more nonlinear and irregular (She
et al., 2015). Taiyuan Basin is the major agricultural region in Shanxi Province, and precise analysis of
the spatial distribution of STN is needed for agricultural management in this basin. With the
development of Vis-NIR spectroscopy, high-quality Vis-NIR spectra have become easy to obtain and
provide inexpensive information on STN. Understanding the multi-scale spatial correlations between
Vis-NIR soil spectra and STN is therefore needed for the spatial prediction of STN in Taiyuan Basin.

The multi-scale variability of soil properties has previously been studied using MEMD (Hu and Si, 2013;
She et al., 2015), but these studies mainly focused on soil physical properties at small watershed 
scales, while little is known about the validity of MEMD for the spatial variability of STN at a 
medium-sized basin scale. The objective of this study was to explore the scale-specific 
relationships between STN and influencing factors in Taiyuan Basin of the Chinese Loess Plateau 
using MEMD analysis. Specifically, the dominant scales of the variability in STN were identified 
at the upstream, midstream and downstream of the basin, and the dominant controls of the 
influencing factors on STN at the corresponding scales were determined. 


2 Materials and methods 


2.1 Study area
Taiyuan Basin (37°00'–38°20'N, 111°30'–113°00'E; 700–1000 m a.s.l.) is located in the central and
eastern Chinese Loess Plateau, China. The basin is enclosed by hills and mountains and covers an
area of 6159 km². Taiyuan Basin has a typical semi-arid climate, with a mean annual
temperature of 9.5°C, mean annual precipitation of 425–520 mm and mean annual evaporation of
1780 mm. Owing to dust deposition during the Quaternary, the landform of Taiyuan Basin is
characterized by a thick loess cover. The thickness of the loess layer ranges from 50 to
3000 m, and the grain size of the loess generally increases from the center to the margin of the
basin (Zhu et al., 2016). Fenhe River, the second largest tributary of the Yellow River, runs through
the basin from northeast to southwest. The major soil types are Calcaric Fluvisols and Calcaric
Cambisols under alkaline conditions according to the FAO-90 soil classification system
(Nachtergaele et al., 2009). The dominant crops in the basin are corn, wheat and millet.


2.2 Experimental design and data collection 


We divided the basin into three parts (upstream, midstream and downstream) according to the direction of
Fenhe River and the elevation of the basin. Then, based on remote sensing images and land
use types, we established sampling transects 1, 2 and 3 perpendicular to Fenhe River
in the upstream, midstream and downstream parts of the basin, respectively (Fig. 1). Each transect
was about 42×10³ m long. Land use in transects 1 and 2 was dominated by cropland and forest
land. Specifically, cropland, planted mainly with corn, accounted for about 95% of the total land in the
transect, while forest land accounted for about 5% of the total land and was mainly located at the
end of the transect. In transect 3, the land was mainly cropland, planted with corn or with a mixture
of corn and fruit trees. Tillage methods differed among plots belonging to different owners and
included both human-powered and mechanical tillage.




Fig. 1 Distribution of sampling transects and sampling points in the three transects in Taiyuan Basin. Sampling 
transects 1, 2 and 3 are located at the upstream, midstream and downstream part of the basin, respectively. 


A total of 383 soil sampling points were laid out in Taiyuan Basin, with 121, 128 and 134 points
in transects 1, 2 and 3, respectively. The interval between sampling points in each transect was 330
m, and the sampling points are shown in Figure 1. If a sampling
point fell on non-agricultural land such as buildings or roads, the nearest point on
agricultural land was used to represent it. The location of each sampling point
was recorded with a GPS receiver.

The field investigation was conducted during 12–31 March 2016. At each sampling point, an
undisturbed soil core was collected from the surface soil layer using a metallic core cylinder of 100 cm³ volume (5 cm in height
and 5 cm in diameter) to determine soil bulk density (BD). BD was
calculated from the dry soil weight, obtained by oven-drying the core at 105°C for 24 h, and the
volume of the core cylinder (Hossain et al., 2015). At each sampling point, five soil samples were also
collected from the surface soil layer (0–20 cm) and mixed into one composite sample for further
analysis. These samples were air-dried, gently crushed and passed through a 2-mm sieve.
Sub-samples were finely ground to pass through a 0.15-mm sieve for STN measurement. The
content of STN was determined by the Kjeldahl method (Bremner and Tabatabai, 1972). Contents
of sand (0.050–2.000 mm), silt (0.002–0.050 mm) and clay (<0.002 mm) were determined by the
pipette method (Gee and Bauder, 1986). Approximately 30 g of soil was placed in a soil holder (1.5
cm in height and 8.0 cm in diameter) and scanned under laboratory conditions for Vis-NIR reflectance spectra with an
ASD FieldSpec3 spectrometer (Analytical Spectral Device Inc., Boulder, USA), with
a spectral range of 350–2500 nm and a resampled resolution of 1.0 nm (1.4 nm in the range of
350–1000 nm and 2.0 nm in the range of 1001–2500 nm).
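
As a small worked illustration of the bulk density calculation described above, the sketch below divides an oven-dried soil mass by the core volume. The dry mass of 127 g is a hypothetical value chosen only for illustration, not a measurement from this study; the core dimensions and nominal volume are those stated in the text, and the function names are our own.

```python
import math


def cylinder_volume_cm3(diameter_cm: float, height_cm: float) -> float:
    """Geometric volume of a cylindrical core (cm^3)."""
    return math.pi * (diameter_cm / 2.0) ** 2 * height_cm


def bulk_density(dry_mass_g: float, core_volume_cm3: float) -> float:
    """Bulk density (g/cm^3) = oven-dried soil mass / core volume."""
    return dry_mass_g / core_volume_cm3


# A 5 cm x 5 cm core has a geometric volume close to the nominal 100 cm^3.
geometric_volume = cylinder_volume_cm3(5.0, 5.0)

# Hypothetical dry mass of 127 g in a nominal 100 cm^3 core.
bd_example = bulk_density(127.0, 100.0)
print(f"core volume = {geometric_volume:.1f} cm^3, BD = {bd_example:.2f} g/cm^3")
```

With the nominal volume, the hypothetical core gives a BD of 1.27 g/cm³, which is of the same order as the transect means reported in Table 1.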

The digital elevation model (DEM) of Taiyuan Basin, with a resolution of 30 m, was downloaded
from the Geospatial Data Cloud (http://www.gscloud.cn/sources). It was used to extract
topographic indices, including elevation, slope gradient and topographic wetness index (TWI), in
ArcGIS 10.5 software (ESRI Inc., USA).
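
For readers unfamiliar with TWI, the sketch below uses the widely cited definition TWI = ln(a / tan β), where a is the specific catchment area and β is the local slope. This is an assumption about the form of the index: the text does not state which TWI variant or ArcGIS workflow was used, so the function name, the minimum-slope guard and the example inputs are illustrative only.

```python
import math


def topographic_wetness_index(specific_catchment_area_m: float,
                              slope_deg: float,
                              min_slope_deg: float = 0.1) -> float:
    """TWI = ln(a / tan(beta)).

    specific_catchment_area_m: upslope area per unit contour width (m).
    slope_deg: local slope in degrees.
    min_slope_deg: floor applied to flat cells so tan(beta) is never zero
                   (an illustrative choice, not taken from the study).
    """
    slope_rad = math.radians(max(slope_deg, min_slope_deg))
    return math.log(specific_catchment_area_m / math.tan(slope_rad))


# Same contributing area, gentle versus steep cell: the gentler cell is "wetter".
twi_gentle = topographic_wetness_index(100.0, 2.0)
twi_steep = topographic_wetness_index(100.0, 45.0)
print(round(twi_gentle, 2), round(twi_steep, 2))
```

The index increases with contributing area and decreases with slope, which is why it is commonly used as a proxy for where water and the solutes it carries tend to accumulate.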


2.3 Data analysis 


Soil Vis-NIR reflectance spectroscopy is regarded as a promising approach to efficiently obtain soil 
chemical and physical properties because it is a physically based, rapid, inexpensive, 
non-destructive and reproducible method (Hu et al., 2015). In this study, the noisy
portions of the spectra at 350–400 and 2451–2500 nm were removed, and the spectra in the range of
400–2450 nm were then transformed by principal component analysis in the MATLAB program (Zhou et al., 2016). The first
two principal components (PC1 and PC2) of the Vis-NIR soil spectra, which accounted for 89% and 4%
of the total variance, respectively, were selected as the combined environmental
factors. The spatial series of STN along with soil physical attributes (BD and sand, silt and clay 
contents), topographic factors (elevation, slope and TWI) and soil spectral components (PC1 and 
PC2) constituted the multivariate data series. 
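
The spectral compression step can be sketched as follows. This is a minimal illustration in Python with NumPy, not the MATLAB code used in the study; the spectra are synthetic stand-ins, and `pca_scores` is a hypothetical helper name. It shows how a samples × bands reflectance matrix is reduced to two score columns that can be treated as spatial series alongside the other factors.

```python
import numpy as np


def pca_scores(spectra: np.ndarray, n_components: int = 2):
    """Return PC scores and explained-variance ratios for a samples x bands matrix."""
    centered = spectra - spectra.mean(axis=0)            # remove the mean spectrum
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)                  # variance share of each PC
    scores = u[:, :n_components] * s[:n_components]      # projections onto the PCs
    return scores, explained[:n_components]


# Synthetic spectra (assumption): 50 samples, 1-nm bands from 400 to 2450 nm,
# built from an overall brightness term, a spectral tilt term and small noise.
rng = np.random.default_rng(0)
wavelengths = np.arange(400, 2451)
brightness = rng.normal(0.0, 1.0, size=(50, 1))
tilt = rng.normal(0.0, 1.0, size=(50, 1))
band_shape = (wavelengths - wavelengths.mean()) / np.ptp(wavelengths)
noise = rng.normal(0.0, 0.001, size=(50, wavelengths.size))
spectra = 0.3 + 0.05 * brightness + 0.01 * tilt * band_shape + noise

scores, explained = pca_scores(spectra)
print(scores.shape, np.round(explained, 3))
```

Each row of `scores` would correspond to one sampling point, so the PC1 and PC2 columns can be ordered along a transect in the same way as STN or elevation.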

The multivariate data series of each of the three sampling transects were decomposed into
intrinsic mode functions (IMFs; IMF1–IMF6) and a residue by the MEMD method, which was
implemented using the MATLAB program developed by Rehman and Mandic (2009). A detailed
description of MEMD can be found in Rehman and Mandic (2010) and Hu and Si
(2013).

The STN at each IMF and residue was predicted from the corresponding IMF and residue of the 
influencing factors by stepwise multiple linear regression (SMLR). The predicted STN contents at 
the sampling scale were obtained by adding all the predicted values of STN at each IMF and 
residue using the following equation: 


$$\mathrm{STN}^{p} = \sum_{i=1}^{t} \mathrm{IMF}_{i}^{p} + R^{p}, \qquad (1)$$


where $\mathrm{STN}^{p}$ is the predicted STN value at the sampling scale (g/kg); $t$ is the number of IMFs; $\mathrm{IMF}_{i}^{p}$
is the predicted value of STN at the $i$th IMF; and $R^{p}$ is the predicted value of STN at the residue.
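
The summation in Equation 1 can be sketched as below. As a simplifying assumption, each component is fitted by ordinary least squares on all factors rather than by stepwise selection, and the components are synthetic series rather than MEMD output; the helper names are hypothetical. The point is only to show that a sampling-scale prediction is obtained by fitting each scale separately and adding the fitted values.

```python
import numpy as np


def fit_predict_ols(factors: np.ndarray, target: np.ndarray) -> np.ndarray:
    """Fit target ~ intercept + factors by least squares and return fitted values."""
    design = np.column_stack([np.ones(len(target)), factors])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return design @ coef


def predict_by_scale(stn_components: np.ndarray, factor_components: np.ndarray) -> np.ndarray:
    """Sum the per-component predictions (all IMFs plus residue), as in Equation 1.

    stn_components:    (t + 1, n) array, one row per IMF plus the residue.
    factor_components: (t + 1, n, k) array with the matching factor components.
    """
    total = np.zeros(stn_components.shape[1])
    for stn_c, factors_c in zip(stn_components, factor_components):
        total += fit_predict_ols(factors_c, stn_c)
    return total


# Synthetic example (assumption): two oscillatory "IMFs" and a smooth "residue",
# each linked to two factors with different coefficients at each scale.
x = np.arange(120)
factor_components = np.stack([
    np.column_stack([np.sin(2 * np.pi * x / 6), np.cos(2 * np.pi * x / 9)]),
    np.column_stack([np.sin(2 * np.pi * x / 40), np.cos(2 * np.pi * x / 25)]),
    np.column_stack([x / 120.0, (x / 120.0) ** 2]),
])
coefficients = np.array([[0.5, -0.2], [0.1, 0.3], [-0.4, 0.2]])
stn_components = np.einsum("cnk,ck->cn", factor_components, coefficients)
stn_components[2] += 1.1                      # constant level carried by the residue

stn_total = stn_components.sum(axis=0)
stn_pred = predict_by_scale(stn_components, factor_components)
print(np.allclose(stn_pred, stn_total))
```

Because the synthetic relationships differ between scales, a single regression on the summed series could not reproduce them with one set of coefficients, whereas the scale-by-scale sum can.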

The accuracy of STN prediction was evaluated by three statistical indices, i.e., coefficient of
determination (R²), root mean square error (RMSE) and normalized root mean square error
(NRMSE), calculated from the measured and predicted values of STN. In these indices, $n$ is the
number of sampling points of STN; $Y_i^{m}$ and $Y_i^{p}$ are the measured and predicted
values of STN at sampling point $i$ (g/kg), respectively; and $\bar{Y}$ is the mean of all measured
STN contents (g/kg). The descriptive statistical analysis for STN, Pearson's correlation analysis
between STN and influencing factors, and SMLR for STN at the sampling scale and for each IMF and residue
were carried out using SPSS 19.0 software (SPSS Inc., USA).
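
A minimal sketch of the three accuracy indices is given below. The definitions are assumptions based on common usage and on the symbols defined above: RMSE is the root of the mean squared difference between measured and predicted values, NRMSE is taken here as RMSE divided by the mean of the measured values, and R² is taken as one minus the ratio of residual to total sum of squares. The study may have used other variants (for example, the squared correlation for R²), and the sample values are invented for illustration.

```python
import numpy as np


def rmse(measured: np.ndarray, predicted: np.ndarray) -> float:
    """Root mean square error, in the units of the measurements (g/kg for STN)."""
    return float(np.sqrt(np.mean((measured - predicted) ** 2)))


def nrmse(measured: np.ndarray, predicted: np.ndarray) -> float:
    """RMSE normalized by the mean of the measured values (assumed normalization)."""
    return rmse(measured, predicted) / float(np.mean(measured))


def r_squared(measured: np.ndarray, predicted: np.ndarray) -> float:
    """1 - SSE/SST (one common definition; assumed here)."""
    sse = np.sum((measured - predicted) ** 2)
    sst = np.sum((measured - np.mean(measured)) ** 2)
    return float(1.0 - sse / sst)


# Invented values (g/kg) used only to exercise the functions.
measured = np.array([1.0, 1.2, 0.9, 1.3])
predicted = np.array([1.1, 1.1, 1.0, 1.2])
print(rmse(measured, predicted), nrmse(measured, predicted), r_squared(measured, predicted))
```

For these four invented points every error is 0.1 g/kg in magnitude, so RMSE is 0.1 g/kg, NRMSE is about 0.09 and R² is 0.6.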


3 Results 


3.1 Statistical analysis of STN with influencing factors at the sampling scale 


The descriptive statistical results of the overall variability of STN in the three sampling transects 
and in all samples are shown in Table 1. Of the three sampling transects, the mean value of STN in 
transect 1 was the smallest, while the CV of STN in transect 3 was the lowest. The average 
elevation was lowest in transect 2, while its CV value was lower in transects 1 and 2 than in 
transect 3. The CV values of STN, elevation, TWI, silt and clay were similar in transects 1 and 2. 
Among the three transects, the mean values of PC1 and PC2 were similar, but there were some 
differences in their CV values. 


Table 1 Descriptive statistics of STN and influencing factors in the three transects and in all samples 


Transect      STN               Elevation         Slope            TWI              BD
              Mean     CV       Mean     CV       Mean    CV       Mean    CV       Mean      CV
              (g/kg)   (%)      (m)      (%)      (°)     (%)              (%)      (g/cm³)   (%)
Transect 1    1.09     27.17    769.56   3.89     3.34    80.59    0.42    241.39   1.27      11.18
Transect 2    1.27     27.09    743.76   2.56     2.34    95.09    0.48    241.37   1.30      9.90
Transect 3    1.17     22.33    753.39   9.05     2.40    74.13    0.88    181.36   1.30      8.91
All samples   1.18     26.26    755.28   6.11     2.68    85.38    0.60    216.93   1.29      10.01

Transect      Sand              Silt              Clay             PC1              PC2
              Mean     CV (%)   Mean     CV (%)   Mean    CV (%)   Mean    CV (%)   Mean      CV (%)
Transect 1    0.33     54.36    0.39     32.18    0.28    36.25    6.39    20.56    2.74      5.77
Transect 2    0.36     47.67    0.40     30.97    0.25    37.99    6.34    14.35    2.23      6.47
Transect 3    0.24     51.37    0.47     21.28    0.30    27.23    6.03    22.25    2.56      6.40
All samples   0.30     53.64    0.42     28.55    0.27    34.00    6.25    19.41    2.50      10.37


Note: STN, soil total nitrogen; TWI, topographic wetness index; BD, bulk density; PC1, the first principal component; PC2, the second
principal component; CV, coefficient of variation.


Spatial distributions of STN and influencing factors in the three sampling transects are shown in 
Figure 2. The distribution of elevation along the three transects was characterized by a bowl-shaped 
topographic depression, and the range of elevation in transect 3 was obviously larger than those in 
transects 1 and 2. STN and other influencing factors displayed different distributions compared 
with elevation in the three transects. The ranges of PC1 were similar among the three transects, 
while the values of PC2 varied greatly, with the ordering of transect 2<transect 3<transect 1. 
Fig. 2 Spatial distributions of STN (soil total nitrogen) and influencing factors along the three transects at the
sampling scale


The correlation coefficients between STN and influencing factors at the sampling scale were 
determined by Pearson's linear analysis (Table 2). For transect 1, STN was significantly correlated 
with elevation, BD, sand, silt, clay and PC1 at P<0.01 level and with PC2 at P<0.05 level. There
was no significant correlation between STN and TWI (P>0.05). For transect 2, STN was
significantly correlated with elevation, slope, BD, sand, silt, clay and PC1 at P<0.01 level. For
transect 3, STN was only significantly correlated with elevation and PC1 at P<0.01 level, and with
sand and silt at P<0.05 level. Overall, in the entire basin, STN was significantly related to
elevation, soil physical properties (BD, sand, silt and clay), and the combined environmental 
factors (PC1 and PC2), while significant relationships of STN with slope and TWI were not 
detected. 
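
The significance levels attached to Pearson's correlation coefficients follow from the sample size. The sketch below converts a coefficient r and a number of points n into a t statistic, t = r·sqrt((n − 2)/(1 − r²)), and compares it with two-sided critical values. The critical values 1.98 and 2.62 are rounded approximations for roughly 119 to 132 degrees of freedom and are an assumption of this sketch; SPSS computes exact P values instead. The function names are hypothetical.

```python
import math

# Approximate two-sided critical t values for about 119-132 degrees of freedom.
T_CRIT_05 = 1.98
T_CRIT_01 = 2.62


def pearson_t(r: float, n: int) -> float:
    """t statistic for testing a Pearson correlation r from n paired points."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


def significance_label(r: float, n: int) -> str:
    """Return '**' (P<0.01), '*' (P<0.05) or '' using the approximate critical values."""
    t_abs = abs(pearson_t(r, n))
    if t_abs >= T_CRIT_01:
        return "**"
    if t_abs >= T_CRIT_05:
        return "*"
    return ""


# Coefficients reported for transect 1 (n = 121 sampling points).
for factor, r in [("Elevation", -0.35), ("PC2", 0.22), ("TWI", 0.06)]:
    print(factor, round(pearson_t(r, 121), 2), significance_label(r, 121))
```

With 121 points, a coefficient of −0.35 is significant at the 0.01 level, 0.22 only at the 0.05 level and 0.06 at neither, in line with the levels stated for transect 1.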


Table 2 Correlation coefficients between STN and influencing factors in the three transects and in all samples 


Influencing factor 


Elevation Slope TWI BD Sand Silt Clay PCI PC2 
Transect 1 -0.35" —0.06 0.06 -0.33" -0.50" 049" 027" 036" 0.22 
Transect 2 -0.25" 0.23" 0.02 -0.25"* —0.50™ 0.48" 0.28" 0.56" —0.13 
Transect 3 0.34" —-0.14 0.11 —0.09 -0.19* 0.22" 0.02 0.42™ —0.13 
All samples -0.31" —0.01 0.06 -0.21" -0.377 0.38" 0.16" 0.41” —0.20™ 


Note: *, significant correlation at P<0.05 level; **, significant correlation at P<0.01 level. 


Prediction equations of STN with one or more influencing factors were obtained using the SMLR
method, which explained 49%, 46% and 34% of the total variance in STN for transects 1, 2 and 3,
respectively (Table 3). The accuracy of STN prediction was therefore
generally low in all three transects, especially in transect 3.
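
To make the idea of stepwise regression concrete, the sketch below implements a simplified forward selection that adds, one at a time, the factor giving the largest gain in adjusted R² and stops when the gain falls below a threshold. This is not the SPSS procedure used in the study, which enters and removes variables on the basis of F-test probabilities; the data are synthetic, and the column names and the 0.01 threshold are hypothetical choices for illustration.

```python
import numpy as np


def adjusted_r2(target: np.ndarray, factors: np.ndarray) -> float:
    """Adjusted R^2 of an ordinary least-squares fit with an intercept."""
    n, k = factors.shape
    design = np.column_stack([np.ones(n), factors])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    residual = target - design @ coef
    r2 = 1.0 - (residual @ residual) / np.sum((target - target.mean()) ** 2)
    return float(1.0 - (1.0 - r2) * (n - 1) / (n - k - 1))


def forward_select(target, factors, names, min_gain=0.01):
    """Greedy forward selection by adjusted R^2 (a simplified stand-in for SMLR)."""
    selected, best = [], 0.0
    remaining = list(range(factors.shape[1]))
    while remaining:
        candidates = [(adjusted_r2(target, factors[:, selected + [j]]), j) for j in remaining]
        score, j = max(candidates)
        if score - best < min_gain:      # stop when no factor adds enough
            break
        selected.append(j)
        remaining.remove(j)
        best = score
    return [names[j] for j in selected], best


# Synthetic data (assumption): only "clay" and "pc1" truly influence the target.
rng = np.random.default_rng(1)
names = ["clay", "slope", "pc1", "twi"]
factors = rng.normal(size=(120, 4))
target = 1.0 + 1.2 * factors[:, 0] + 0.7 * factors[:, 2] + rng.normal(0.0, 0.3, size=120)

chosen, best_adj_r2 = forward_select(target, factors, names)
print(chosen, round(best_adj_r2, 2))
```

In this synthetic case the procedure keeps the two informative factors and leaves out the two uninformative ones, which is the behavior expected of a stepwise method.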


Table 3 Prediction equations of STN with influencing factors using SMLR for the three transects


Transect     Equation                                                                      F      R²
Transect 1   STN = -0.27 + 1.21(0.52)×Clay + 0.71(0.24)×Silt + 0.11(0.48)×PC1              37.0   0.49
Transect 2   STN = 0.40 + 0.03(0.18)×Slope - 0.43(0.16)×BD + 0.82(0.29)×Clay
             + 0.16(0.43)×PC1                                                              26.2   0.46
Transect 3   STN = 2.50 - 0.01(0.31)×Elevation + 0.04(0.22)×TWI - 0.32(0.15)×Sand
             + 0.07(0.37)×PC1 - 0.32(0.20)×PC2                                             13.1   0.34


Note: SMLR, stepwise multiple linear regression; R², adjusted coefficient of determination. Numbers in parentheses are the standardized
regression coefficients.


3.2 Multi-scale spatial relationships between STN and influencing factors


The multivariate data series of STN and influencing factors for each transect were decomposed
into six different modes of oscillation (IMF1–IMF6) and one residue by the MEMD method, which are
presented in Figure 3. Within each IMF, the number and width of oscillations were similar for STN and the
influencing factors, whereas the oscillation modes differed greatly among IMFs.


Fig. 3 IMFs (intrinsic mode functions) and residues for STN and influencing factors in the three transects. IMF1–
IMF6, intrinsic mode function 1–6, respectively.


The scales of STN and influencing factors were identified by Hilbert transform from the oscillatory
mode at each IMF, and the mean scale of STN and the influencing factors was used to represent the
scale of each specific IMF (Table 4). The mean scales of IMF1–IMF4 were similar in the three
sampling transects, at around 1000, 1500, 2700 and 4500 m for IMF1, IMF2, IMF3 and
IMF4, respectively. For IMF5, the mean scale was similar in transects 1 and 2 (about 8500 m), whereas
the scale in transect 3 was about 5700 m. For IMF6, the mean scale was similar in transects 1 and
3 (approximately 12,500 m), whereas the mean scale in transect 2 was about 15,500 m. The scales
identified for IMF6 differed widely among factors, with CV values of 15.84%, 16.32% and 23.12%
for transects 1, 2 and 3, respectively.
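
The idea of reading a spatial scale from an IMF with the Hilbert transform can be sketched as follows. The analytic signal gives an instantaneous phase, its increment per sampling step gives an instantaneous frequency, and the sampling interval divided by the mean frequency gives a mean scale in metres. This is a simplified illustration under our own assumptions (an FFT-based analytic signal, a plain average of the instantaneous frequency, and a synthetic sinusoid in place of a real IMF); the exact averaging used in the study is not stated, and the function names are hypothetical.

```python
import numpy as np


def analytic_signal(series: np.ndarray) -> np.ndarray:
    """Analytic signal via the FFT (keeps positive frequencies, doubles them)."""
    n = len(series)
    spectrum = np.fft.fft(series)
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)


def mean_scale_m(imf: np.ndarray, spacing_m: float) -> float:
    """Mean spatial scale (m) = sampling interval / mean instantaneous frequency."""
    phase = np.unwrap(np.angle(analytic_signal(imf)))
    cycles_per_step = np.diff(phase) / (2.0 * np.pi)   # instantaneous frequency
    return float(spacing_m / np.mean(cycles_per_step))


# Synthetic "IMF": one oscillation every 3 sampling intervals of 330 m.
positions = np.arange(120)
imf = np.sin(2.0 * np.pi * positions / 3.0)
scale = mean_scale_m(imf, spacing_m=330.0)
print(round(scale, 1))
```

For this synthetic series the recovered scale is 990 m, i.e. three sampling intervals, which is of the same order as the IMF1 scales in Table 4.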


Table 4 Actual scales of STN and influencing factors for each intrinsic mode function (IMF) in the three transects


Transect     Factor      Scale (m)
                         IMF1    IMF2    IMF3    IMF4    IMF5    IMF6
Transect 1   STN         1033    1568    2712    5005    8421    13,954
             Elevation   981     1595    3070    5024    8074    14,539
             Slope       986     1733    2717    5183    7299    13,421
             TWI         1021    1642    2739    4916    8244    10,394
             BD          1015    1544    2622    5288    8319    14,273
             Sand        981     1540    2918    5319    8451    10,373
             Silt        988     1593    2845    4658    8931    10,591
             Clay        963     1610    3138    5742    8210    10,979
             PC1         988     1563    2756    4996    9079    15,832
             PC2         989     1598    2680    4674    8379    13,869
             Mean        995     1599    2819    5081    8341    12,823
             CV (%)      2.16    3.54    6.09    6.34    5.79    15.84
Transect 2   STN         922     1557    2773    4369    8898    15,090
             Elevation   979     1594    2996    5075    9716    20,515
             Slope       911     1466    2685    4812    9099    16,733
             TWI         984     1421    2823    4985    8766    14,414
             BD          923     1426    2545    4854    7410    12,244
             Sand        964     1483    2866    4389    9490    14,461
             Silt        1007    1444    2708    4305    9480    12,271
             Clay        904     1466    2876    4293    8887    15,431
             PC1         1032    1466    2648    4175    7929    18,210
             PC2         932     1456    2519    4625    8848    16,364
             Mean        956     1478    2744    4588    8852    15,573
             CV (%)      4.60    3.76    5.53    7.06    8.03    16.32
Transect 3   STN         954     1444    2901    3865    5880    9199
             Elevation   973     1459    2742    4856    6507    15,728
             Slope       984     1433    2598    4098    5149    14,738
             TWI         1079    1514    2743    4470    6136    15,348
             BD          925     1567    2657    4241    5748    10,081
             Sand        941     1614    2414    4150    5626    11,672
             Silt        964     1566    2670    4482    5375    10,888
             Clay        934     1567    2494    3817    5576    9793
             PC1         989     1439    2614    4132    5444    9534
             PC2         975     1412    2598    4355    6320    16,194
             Mean        972     1502    2643    4247    5716    12,317
             CV (%)      4.45    4.83    5.15    7.30    7.24    23.12


Note: IMF1–IMF6, intrinsic mode function 1–6, respectively.
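
As a worked check of how the Mean and CV rows in Table 4 relate to the individual scales, the sketch below recomputes them for IMF1 in transect 1 from the ten tabulated values. It assumes that CV is the sample standard deviation (n − 1 denominator) divided by the mean, which is our reading rather than something stated in the text.

```python
import statistics

# IMF1 scales (m) for STN and the nine influencing factors in transect 1 (Table 4).
imf1_scales = [1033, 981, 986, 1021, 1015, 981, 988, 963, 988, 989]

mean_scale = statistics.mean(imf1_scales)
cv_percent = statistics.stdev(imf1_scales) / mean_scale * 100.0   # sample SD / mean

print(f"mean = {mean_scale:.1f} m, CV = {cv_percent:.2f}%")
```

The result, a mean of 994.5 m and a CV of 2.16%, agrees with the tabulated 995 m and 2.16%, which supports the sample-standard-deviation reading.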


The percentages of variance explained by each IMF and residue for STN and influencing factors in the three
transects are shown in Table 5. For STN, the total variance explained by the IMFs and residue was
ranked as follows: transect 1<transect 2<transect 3, with percentages of 80.84%, 84.29% and
87.43% for transects 1, 2 and 3, respectively. The largest variance in STN was in IMF1 for transect 1,
IMF1 and IMF5 for transect 2, and IMF1 for transect 3. The variance of STN and
influencing factors (with the exception of elevation) was mainly contributed by IMF1, while the
variance in elevation was mainly contributed by the residue. Finally, with the exceptions of sand
content in transects 1 and 3 and PC2 in transect 3, the sum of the variance percentages of the six
IMFs and the residue for each influencing factor in the three transects was less than 100.00%.
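
The percentage of variance attributed to a component can be sketched as the variance of that component divided by the variance of the original series. One plausible reason why such percentages need not sum to 100%, offered here as an assumption rather than as the explanation given in the study, is that decomposed components are not guaranteed to be mutually uncorrelated, so covariance between components is left out of the sum. The series below are synthetic.

```python
import numpy as np


def variance_percentages(components: np.ndarray) -> np.ndarray:
    """Percentage of the variance of the summed series carried by each component."""
    total = components.sum(axis=0)
    return np.var(components, axis=1) / np.var(total) * 100.0


x = np.arange(120)
fast = np.sin(2.0 * np.pi * x / 6.0)     # short-scale oscillation
slow = np.sin(2.0 * np.pi * x / 40.0)    # long-scale oscillation

# Case 1: uncorrelated components, percentages add up to 100%.
clean = variance_percentages(np.stack([fast, slow]))

# Case 2: same total series, but part of the slow oscillation leaks into the
# first component, so the two components are correlated.
mixed = variance_percentages(np.stack([fast + 0.3 * slow, 0.7 * slow]))

print(np.round(clean, 2), round(float(clean.sum()), 2))
print(np.round(mixed, 2), round(float(mixed.sum()), 2))
```

In the second case the two components still add up exactly to the original series, yet their variance percentages sum to 79% rather than 100%, because the shared variation is counted in neither.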


Table 5 Percentage of variance explained by each IMF and residue for STN and influencing factors in the three 
transects 


Transect     Factor      Percentage of variance (%)
                         IMF1    IMF2    IMF3    IMF4    IMF5    IMF6    Residue   Total
Transect 1   STN         27.01   13.22   13.75   3.45    1.81    7.10    14.50     80.84
             Elevation   0.29    0.27    1.51    2.91    0.67    3.86    84.25     93.76
             Slope       42.57   8.06    12.33   11.66   4.85    3.65    8.90      92.02
             TWI         43.15   20.62   12.72   6.20    7.04    0.67    5.00      95.40
             BD          23.89   11.34   8.96    6.62    2.82    14.67   6.16      74.46
             Sand        22.72   9.11    8.40    6.82    1.85    5.39    46.13     100.42
             Silt        26.11   10.39   7.61    6.08    1.70    3.25    39.73     94.87
             Clay        31.22   14.59   9.95    5.48    3.44    6.56    16.31     87.55
             PC1         20.01   7.89    7.12    6.74    7.86    17.55   29.39     96.56
             PC2         26.26   13.36   13.25   3.92    7.70    7.15    19.03     91.27
Transect 2   STN         26.48   12.10   9.53    3.67    21.69   2.97    7.85      84.29
             Elevation   0.51    0.48    0.23    0.83    2.53    0.82    79.97     85.37
             Slope       17.52   8.14    9.64    8.08    11.30   8.59    14.85     78.12
             TWI         25.89   15.14   12.38   8.92    7.24    2.80    5.38      77.75
             BD          34.26   12.61   10.23   10.06   5.09    0.77    9.12      82.14
             Sand        15.66   9.29    6.60    2.72    19.19   8.58    29.24     91.28
             Silt        17.53   13.28   8.53    2.96    19.73   4.31    17.55     83.89
             Clay        27.76   12.46   11.14   5.25    6.30    1.32    20.77     91.00
             PC1         27.43   14.38   15.04   6.09    8.13    1.21    6.43      78.71
             PC2         43.84   13.02   12.24   6.14    2.93    0.85    0.95      79.97
Transect 3   STN         24.49   9.33    11.95   2.81    16.36   14.05   8.44      87.43
             Elevation   0.05    0.09    0.32    0.05    1.86    19.19   74.39     95.95
             Slope       30.44   9.47    12.63   8.19    10.69   9.93    8.57      89.92
             TWI         39.38   15.14   9.79    4.02    5.60    7.46    1.36      82.75
             BD          24.84   12.91   10.08   11.32   13.66   2.70    18.09     93.60
             Sand        26.84   11.64   15.49   6.04    9.30    4.28    26.66     100.25
             Silt        27.53   15.23   17.10   6.97    10.16   2.19    13.10     92.28
             Clay        34.11   12.72   15.26   6.50    7.16    5.89    18.15     99.79
             PC1         15.26   5.52    6.53    5.88    36.88   25.67   5.20      99.94
             PC2         22.13   6.88    7.45    6.10    14.59   28.34   14.53     100.02


As shown in Table 6, the correlation coefficients between STN and influencing factors for all
IMFs and the residue based on MEMD varied greatly among scales. The relationships between
STN and influencing factors in transects 1 and 2 were stronger than those in transect 3. The effect of
topographic factors on STN was similar among the three sampling transects and was
significant at only a few scales. The correlation of PC1 with STN was also similar among the three transects
and was significant at most scales. The relationships between soil physical properties and STN
were similar in transects 1 and 2, where they were significant at most scales, whereas in transect 3 they
were significant at only a few scales. Additionally, the correlations between STN and
influencing factors were significant for the residue in the three sampling transects.
Table 6 Correlation coefficients between STN and influencing factors for each IMF and residue based on
multivariate empirical mode decomposition (MEMD) in the three transects


Transect     IMF/Residue   Elevation   Slope     TWI      BD        Sand      Silt     Clay     PC1       PC2
Transect 1   IMF1          -0.02       0.22*     0.04     -0.16     -0.48**   0.39**   0.28**   0.17      0.17
             IMF2          0.12        -0.06     0.17     0.37**    0.48**    0.37**   0.28**   0.53**    -0.14
             IMF3          -0.22*      0.05      0.30**   -0.21*    -0.28**   0.50**   -0.10    0.70**    0.23*
             IMF4          0.36**      0.00      -0.11    -0.67**   -0.47**   0.23*    0.62**   0.39**    0.40**
             IMF5          -0.15       -0.10     0.19*    -0.41**   -0.62**   0.40**   0.43**   -0.27**   -0.25**
             IMF6          -0.59**     -0.10     0.18*    -0.76**   -0.47**   0.39**   -0.08    0.42**    -0.13
             Residue       -0.96**     -0.98**   0.98**   -0.46**   -0.86**   0.90**   0.75**   -0.31**   0.90**
Transect 2   IMF1          -0.04       0.18*     0.14     0.26**    0.28**    0.18*    0.19*    0.14      0.17
             IMF2          0.22*       -0.27**   0.01     -0.23*    -0.23**   0.20*    0.10     0.29**    0.16
             IMF3          0.40**      -0.11     0.02     -0.19*    -0.41**   0.16     0.41**   -0.03     0.29**
             IMF4          -0.26**     0.48**    -0.02    0.11      -0.22*    0.09     0.20*    -0.23**   -0.17
             IMF5          0.07        0.47**    0.49**   -0.52**   -0.95**   0.97**   0.62**   0.24**    -0.78**
             IMF6          -0.47**     -0.08     0.87**   0.41**    0.88**    0.84**   0.91**   -0.43**   -0.66**
             Residue       -1.00**     0.98**    0.89**   0.87**    0.75**    0.85**   0.60**   -0.97**   0.19*
Transect 3   IMF1          0.04        0.03      0.18*    0.06      -0.41**   0.48**   0.03     0.27**    -0.02
             IMF2          -0.23**     -0.05     0.13     0.35**    -0.22*    0.28**   -0.06    0.11      -0.05
             IMF3          0.06        0.00      0.60**   0.14      0.04      0.01     0.07     0.25**    0.21*
             IMF4          -0.26**     0.07      0.74**   -0.16     -0.25**   0.22*    0.09     0.24**    0.36**
             IMF5          -0.22*      -0.16     0.32**   -0.76**   -0.29**   0.06     0.42**   0.47**    -0.69**
             IMF6          -0.64**     -0.40**   0.12     0.49**    0.52**    0.88**   0.01     0.31**    -0.22*
             Residue       -0.52**     -0.27**   0.18*    0.31**    -0.32**   0.29**   0.33**   0.72**    -0.37**


Note: *, significant correlation at P<0.05 level; **, significant correlation at P<0.01 level. 


3.3 STN prediction based on multi-scale spatial relationships of STN and influencing factors 


Based on the multi-scale effects of influencing factors on STN, we derived prediction equations of
STN at each scale using SMLR. The results are shown in Table 7. The R² values of the prediction
equations ranged from 0.28 to 1.00 and generally increased from IMF1 to IMF6. STN at the sampling
scale was predicted by adding all predicted values of STN at each IMF and the residue, and its
prediction accuracy is shown in Table 8. Based on the multi-scale spatial relationships between STN
and influencing factors from MEMD, the R² values between measured STN and
predicted STN at the sampling scale were 0.82, 0.80 and 0.77 for transects 1, 2 and 3, respectively.
They were notably higher than those (0.49, 0.46 and 0.34 for transects 1, 2 and 3, respectively)
derived from the original dataset by SMLR at the sampling scale. The RMSE and NRMSE values
of the predictions derived by MEMD were also lower than those obtained by SMLR.

The correlation coefficients between measured STN at the sampling scale and STN predicted by each
IMF (or residue) from all influencing factors are shown in Figure 4a. The results showed that
IMF1, IMF2 and IMF3 contributed almost equally to the overall prediction of STN in transect 1,
IMF1 and IMF5 were the dominant contributors in transect 2, and IMF1 and IMF6 were the
main contributors in transect 3. The correlation coefficients between measured STN at the sampling
scale and STN predicted by each influencing factor from different IMF scales are shown in Figure
4b. These results showed that sand, silt and PC1 were the main explanatory factors for STN
prediction in transect 1, sand and silt were the dominant explanatory factors in transect 2, and
elevation, BD and PC1 were the main factors for STN prediction in transect 3.


4 Discussion 


The spatial distributions of STN and the relationships of STN with influencing factors were 
scale-dependent. To reveal the scale-specific effects of influencing factors on STN, we analyzed 
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Table 7 Predicted equations and regression statistics (F value and adjusted R?) for STN at each IMF and residue 


using SMLR based on MEMD 
IMF/ ; 
Transect Residue Equation F R 
STN- -0.01(0.15)xElevation--1.29(0.54)xSilt*-0.84(0.31)xClay-0.10(0.37)x 
IME PC1+0.29(0.15)xPC2 HET y 
mp2 STN=-0.01+0.01(0.14)xElevation+0.39(0.17)xBD+1.09(0.42)xSilt+0.50(0.18)* T fai 
Clay+0.11(0.38)xPC1-0.31(0.17)xPC2 : 
IMF 3  STN = 0.01 + 0.01(0.11)×Elevation + 0.02(0.16)×Slope + 0.08(0.27)×TWI + 1.20(0.38)×Silt + 0.19(0.60)×PC1  [illegible]  [illegible]
IMF 4  STN = 0.01 + 0.01(0.87)×Elevation - 0.02(0.27)×Slope + 0.09(0.44)×TWI + 0.43(0.36)×Sand + 1.72(0.74)×Clay + 0.15(0.94)×PC1 - 0.63(0.36)×PC2  [illegible]  [illegible]
IMF 5  STN = -0.01 - 0.13(1.93)×Slope - 0.28(2.05)×TWI - 2.88(1.72)×BD - 2.25(0.94)×Silt + 0.71(0.34)×Clay - 0.14(1.30)×PC1 - 1.22(1.34)×PC2  [illegible]  [illegible]
IMF 6  STN = 0.01 - 0.01(0.64)×Elevation - 0.32(2.11)×Slope - 0.38(0.44)×TWI - 4.26(2.93)×BD - 1.85(0.96)×Sand - 7.98(2.32)×Silt - 0.36(2.49)×PC1 - 0.46(0.26)×PC2  43,351.7  1.00
Residue  STN = 9.53 - 0.26(1.85)×Slope - 1.26(0.39)×BD - 0.89(0.33)×Clay - 0.01(0.05)×PC1 - 2.27(1.38)×PC2  1,418,244.7  1.00

Transect 2
IMF 1  STN = 0.05(0.24)×Slope - 0.66(0.28)×BD + 0.94(0.27)×Silt + 1.09(0.30)×Clay + 0.74(0.34)×PC[illegible]  94  0.28
IMF 2  STN = -0.06(0.34)×Slope + 0.72(0.27)×Silt + 1.14(0.31)×Clay + 0.16(0.33)×PC1 + 0.81(0.41)×PC2  12.6  0.34
IMF 3  STN = 0.05(0.49)×Elevation - 0.03(0.28)×Slope - 0.07(0.31)×TWI + 0.39(0.19)×BD - 1.11(0.51)×Sand - 0.06(0.12)×PC1 + 0.88(0.35)×PC2  [illegible]  [illegible]
IMF 4  STN = -0.07(1.49)×Elevation + 0.06(0.55)×Slope + 0.25(1.31)×TWI + 1.12(0.49)×BD - 1.52(0.52)×Silt + 3.25(0.71)×Clay + 0.20(0.39)×PC1  53.6  0.76
IMF 5  STN = 0.02(0.21)×Elevation + 0.05(0.21)×Slope + 0.19(0.33)×TWI + 0.86(0.16)×BD - 1.56(0.58)×Silt - 0.62(0.10)×Clay - 0.19(0.22)×PC1 - 1.86(0.46)×PC2  [illegible]  [illegible]
IMF 6  STN = -0.01(0.44)×Elevation + 0.06(0.76)×Slope + 1.08(0.54)×BD + 1.78(1.31)×Silt - 0.30(0.18)×Clay - 0.09(0.75)×PC1 - 0.20(0.20)×PC2  [illegible]  1.00
Residue  STN = 2.69 - 0.01(0.33)×Elevation + 0.02(0.16)×Slope - 0.14(0.45)×TWI + 0.46(0.20)×Clay - 0.12(0.02)×PC2  [illegible]  1.00

Transect 3
IMF 1  STN = -1.23(0.60)×Sand - 0.94(0.34)×Clay + 0.08(0.31)×PC1 - 0.40(0.24)×PC2  15.8  0.33
IMF 2  STN = -0.01(0.45)×Elevation + 0.57(0.48)×Silt  14.0  0.28
IMF 3  STN = 0.03(0.18)×Slope + 0.11(0.63)×TWI + 0.08(0.28)×PC1  34.7  0.45
IMF 4  STN = 0.04(0.44)×Slope + 0.13(0.96)×TWI - 0.16(0.14)×BD + 0.52(0.31)×Silt  161.4  0.83
IMF 5  STN = -0.01(0.87)×Elevation + 0.12(0.63)×Slope + 0.15(0.55)×TWI - 0.64(0.26)×BD - 0.89(0.31)×Sand - 1.17(0.24)×Clay - 1.80(1.07)×PC2  288.5  0.94
IMF 6  STN = -0.01(0.75)×Elevation + 0.02(0.13)×Slope + 0.09(0.39)×TWI - 3.55(0.69)×BD - 1.29(0.19)×Silt - 0.02(0.16)×PC1 - 0.29(0.25)×PC2  4,751.6  1.00
Residue  STN = 3.12 + 0.01(0.22)×Elevation - 0.18(1.25)×Slope + 2.86(1.35)×Silt - 0.04(0.19)×PC1 - 0.92(0.76)×PC2  656,059.2  1.00


Note: Numbers in parentheses are the standardized regression coefficients. 


Table 8 Statistical indices used to assess the overall prediction accuracy of STN by SMLR and MEMD methods

Method  Transect    RMSE (g/kg)  NRMSE  R²
SMLR    Transect 1  0.21         0.18   0.49
SMLR    Transect 2  0.25         0.20   0.46
SMLR    Transect 3  0.21         0.18   0.34
MEMD    Transect 1  0.17         0.15   0.82
MEMD    Transect 2  0.20         0.16   0.80
MEMD    Transect 3  0.17         0.14   0.77

Note: RMSE, root mean square error; NRMSE, normalized root mean square error; R², coefficient of determination.


the relationships between STN and influencing factors at the sampling scale and at multiple scales
in this study. The correlations between STN and influencing factors such as elevation, BD, sand,
silt, clay, PC1 and PC2 at the sampling scale implied that the variability of STN at this scale was
mainly correlated with BD, the balance of different soil particle sizes, elevation and the combined
environmental factors PC1 and PC2. The effects of influencing factors on STN were weaker at the
downstream than at the upstream and midstream of the basin, and STN at the downstream was
correlated only with elevation, sand, silt and PC1, indicating that local processes, such as
perturbation induced by tillage and biological activity, obscured the other expected relationships
so that their effects were non-significant. The least accurate prediction of STN by SMLR, which was
in transect 3, might be attributed to that transect having the lowest CV among the three transects.

Fig. 4 Correlation coefficients (a) between measured STN and predicted STN by each IMF (or residue) of all
influencing factors and (b) between measured STN and predicted STN by each influencing factor of all IMFs (and
residue) in the three transects

To examine the multi-scale spatial correlations between STN and influencing factors, we
decomposed the multivariate data series using MEMD. The MEMD method projects the
multivariate data series along different directions in an n-dimensional space, aligns the common
scales of the multivariate data, and then groups similar scales among different spatial series to
represent the actual scales of the related influencing processes. The scales can be identified from
the common oscillatory modes of all spatial series within the n-variate IMF by using the Hilbert
transform (Rehman and Mandic, 2010). In our study, the scale of each IMF was represented by the
mean of the scales of all spatial series. The scales of IMF1 to IMF4 were nearly identical in the
three transects, showing that related processes might operate at scales of about 1000, 1500, 2700
and 4500 m in Taiyuan Basin, respectively. The scales of IMF5 and IMF6 differed among the three
transects, which implied that variation in tillage method and soil properties could affect the
scale at which STN and influencing factors operated. However, the specific tillage methods and land
management practices along the three transects, such as human-powered or mechanical tillage,
irrigation frequency and amount, and whether straw is returned, should be investigated further to
explain these differences. The differences between the scales identified for environmental factors
and for STN were larger for IMF6 than for the other IMFs, which indicated that a common scale
among STN and influencing factors at the large scale of ≥10,000 m was not obvious for the three
transects.
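
As an illustration of how a spatial scale can be read from a single oscillatory component with the Hilbert transform, the following minimal Python sketch estimates the scale as the reciprocal of the mean instantaneous frequency. It is not the procedure used in this study: the function names (`analytic_signal`, `mean_scale`), the 100 m sampling interval and the synthetic sine series are all assumptions made for the example, and averaging the instantaneous frequency is only one of several possible conventions.

```python
import numpy as np


def analytic_signal(series):
    """Return the analytic signal of a real series (FFT-based Hilbert transform)."""
    n = len(series)
    spectrum = np.fft.fft(series)
    weights = np.zeros(n)
    weights[0] = 1.0                      # keep the zero-frequency term
    if n % 2 == 0:
        weights[n // 2] = 1.0             # keep the Nyquist term once
        weights[1:n // 2] = 2.0           # double the positive frequencies
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)


def mean_scale(component, spacing_m):
    """Estimate the characteristic scale (m) of one oscillatory component."""
    phase = np.unwrap(np.angle(analytic_signal(component)))  # instantaneous phase (rad)
    inst_freq = np.diff(phase) / (2.0 * np.pi * spacing_m)   # cycles per metre
    inst_freq = inst_freq[inst_freq > 0]                     # discard non-physical values
    return 1.0 / inst_freq.mean()


# Hypothetical stand-in for one IMF: a 1000 m oscillation sampled every 100 m.
spacing = 100.0
distance = np.arange(0.0, 40000.0, spacing)
imf_like = np.sin(2.0 * np.pi * distance / 1000.0)
scale_estimate = mean_scale(imf_like, spacing)
print(f"estimated scale: {scale_estimate:.0f} m")
```

For a multivariate IMF, the same estimate could be computed for each variable's series and then averaged, which corresponds to representing the IMF scale by the mean of the scales of all spatial series.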

The distributions of variance among the IMFs and the residue differed among the three
transects. The high variance contributions of IMF1-IMF6 to STN meant that STN varied mainly at
the scales identified for IMF1 to IMF6 by MEMD. However, the high variance contribution of the
residue to elevation might indicate that the dominant variation in elevation occurred at scales
larger than those of the six identified IMFs, and that a longer sampling transect would be needed
to identify the scales of large variation for elevation. The variance percentages of STN and
influencing factors, with the exception of elevation, were mainly contributed by IMF1, which showed
that the spatial variations in STN and these influencing factors were large at the scale of
approximately 1000 m. Moreover, with the exceptions of sand content in transects 1 and 3 and PC2 in
transect 3, the sums of the variances of IMF1 to IMF6 and the residue for the influencing factors
and STN were less than 100.00% (Table 5), which might be due to the lack of strict orthogonality
between different IMFs for a given influencing factor (Hu and Si, 2013).
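
The effect of imperfect orthogonality on the variance budget can be shown with a small numerical sketch. The Python example below is only an illustration on synthetic sine components (the function name `variance_percentages` and all series are assumptions, not data from this study): when two components are uncorrelated their variance percentages sum to 100%, whereas positively correlated components sum to less than 100% because the covariance term is not attributed to either component.

```python
import numpy as np


def variance_percentages(components, original):
    """Percent of the variance of `original` carried by each component."""
    total = np.var(original)
    return [100.0 * np.var(c) / total for c in components]


x = np.linspace(0.0, 1.0, 500, endpoint=False)

# Case 1: two components at different frequencies (orthogonal over the transect).
a = np.sin(2.0 * np.pi * 5.0 * x)
b = 0.5 * np.sin(2.0 * np.pi * 20.0 * x)
orthogonal_sum = sum(variance_percentages([a, b], a + b))

# Case 2: two components sharing a frequency (positively correlated).
c = 0.5 * np.sin(2.0 * np.pi * 5.0 * x + 0.5)
correlated_sum = sum(variance_percentages([a, c], a + c))

print(f"orthogonal components: {orthogonal_sum:.2f}%")
print(f"correlated components: {correlated_sum:.2f}%")
```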

The multi-scale spatial relationships between STN and influencing factors indicated that sand
content negatively affected, and silt content positively affected, STN content at almost all IMF
scales in the whole basin, and that clay content was positively correlated with STN, especially at
the upstream and midstream of the basin. These results are consistent with a previous report that
fine-textured soil tends to store more total nitrogen at regional and sub-regional scales (Li et al.,
2016). Elevation was negatively correlated with STN, especially at large scales, which might be
attributed to the large variation of elevation at large scales. In a small watershed of the Chinese
Loess Plateau, Hu et al. (2013) pointed out that the relationship between soil water content and
elevation is significant at large scales. Soil BD negatively affected STN at some scales at the
downstream of Taiyuan Basin, an effect that was overlooked at the sampling scale. PC1 was
correlated with STN at almost all IMF scales, while PC2 was correlated with STN at some IMF scales.
Previous studies indicated that the variation of soil Vis-NIR spectra depends on soil components,
including STN content (Rossel et al., 2006); therefore, the two principal spectral components, PC1
and PC2, were related to STN in Taiyuan Basin. The correlation between STN and PC1 was positive at
the sampling scale in Taiyuan Basin, but the correlation at multiple scales was positive at the
upstream and downstream of the basin and negative at the midstream. This might result from the low
values of PC1 at the midstream. In addition, the multi-scale spatial correlations between STN and
influencing factors in transect 3 were weaker than those in transects 1 and 2, which might be due
to the greater variation in tillage methods resulting from the high variation of topography in
transect 3.

In this study, we obtained the prediction equations of STN at multiple scales by SMLR based on
MEMD. The increasing trend in the adjusted R² values of the STN prediction equations from IMF1 to
IMF6 might indicate that the prediction equations are more accurate at large scales, where the
effects of the selected influencing factors on STN are more dominant. The prediction of STN at the
sampling scale by summing all of the predicted IMFs and residue was more accurate than that by
using SMLR directly (R² of 0.77-0.82 versus 0.34-0.49; Table 8). Thus, the correlations between STN
and influencing factors at the sampling scale alone did not reveal the complex relationships
between them.
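
Why summing scale-specific predictions can outperform a single regression at the sampling scale can be sketched with synthetic data. The Python example below is a simplified illustration under stated assumptions, not a reproduction of the analysis: ordinary least squares stands in for SMLR, the scale components are constructed by hand rather than obtained from MEMD, and all names (`ols_predict`, `rmse`, `r_squared`) and coefficients are hypothetical. The response depends positively on a factor at a short scale and negatively at a long scale, so the two effects partly cancel when the factor is used undecomposed.

```python
import numpy as np


def ols_predict(X, y):
    """Fit y = b0 + X b by ordinary least squares and return the fitted values."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return design @ coef


def rmse(obs, pred):
    return float(np.sqrt(np.mean((obs - pred) ** 2)))


def r_squared(obs, pred):
    ss_res = np.sum((obs - pred) ** 2)
    ss_tot = np.sum((obs - obs.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


rng = np.random.default_rng(0)
position = np.arange(200)

# Hand-made scale components of one influencing factor.
factor_short = np.sin(2.0 * np.pi * position / 10.0)
factor_long = np.sin(2.0 * np.pi * position / 100.0)
factor = factor_short + factor_long

# Response components: opposite-signed effects at the two scales.
stn_short = 0.8 * factor_short + 0.05 * rng.standard_normal(position.size)
stn_long = 1.2 - 0.5 * factor_long          # plays the role of a trend-like residue
stn = stn_short + stn_long

# Direct regression at the sampling scale.
direct = ols_predict(factor[:, None], stn)

# Scale-wise regressions, then summation of the predicted components.
multiscale = (ols_predict(factor_short[:, None], stn_short)
              + ols_predict(factor_long[:, None], stn_long))

print("direct:     RMSE", rmse(stn, direct), "R2", r_squared(stn, direct))
print("multiscale: RMSE", rmse(stn, multiscale), "R2", r_squared(stn, multiscale))
```

In this construction the direct regression explains little of the variance because a single coefficient cannot represent effects of opposite sign, while the summed scale-wise predictions recover both.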

The correlation between the overall STN at the sampling scale and each predicted IMF
(or residue), and between the overall STN at the sampling scale and the total STN predicted by each
influencing factor across the different IMF scales, was analyzed for STN prediction at the sampling
scale. The results indicated that among the different IMF scales decomposed by MEMD, the main
contributors to the overall STN prediction were at scales <3000 m for transect 1, at scales around
1000 and 8000 m for transect 2, and at scales of 1000 and 12,000 m for transect 3. This showed
that the leading processes influencing STN content in Taiyuan Basin occurred at the small scale
of around 1000 m. In addition, the predicted residue, which might represent larger scales, was
significantly correlated with the overall STN prediction. This finding indicated that the processes
influencing STN distribution in Taiyuan Basin might also operate at scales ≥15,000 m, which
could be analyzed using longer sampling transects in future studies. The correlation between the
overall STN at the sampling scale and the STN predicted by each influencing factor indicated that
soil texture was a dominant explanatory factor in STN prediction at the upstream and midstream of
the basin, PC1 played a dominant role at the upstream and downstream, and elevation made an
important contribution at the downstream owing to its larger variance in transect 3.
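
The ranking of scales by their contribution to the overall prediction can be illustrated by correlating the measured series with each predicted component. The following Python sketch uses synthetic components only; the component labels, amplitudes and noise level are assumptions chosen for the example and do not correspond to the values in Fig. 4.

```python
import numpy as np


def pearson_r(a, b):
    """Pearson correlation coefficient between two series."""
    return float(np.corrcoef(a, b)[0, 1])


rng = np.random.default_rng(1)
position = np.arange(300)

# Hypothetical predicted components: a strong short-scale one and a weak long-scale one.
predicted = {
    "short scale": 0.9 * np.sin(2.0 * np.pi * position / 10.0),
    "long scale": 0.2 * np.sin(2.0 * np.pi * position / 150.0),
}

# Synthetic "measured" series: sum of the components plus noise.
measured = sum(predicted.values()) + 0.1 * rng.standard_normal(position.size)

# Correlate the measured series with each predicted component.
r_by_component = {name: pearson_r(measured, comp) for name, comp in predicted.items()}
dominant = max(r_by_component, key=lambda name: abs(r_by_component[name]))

for name, r in r_by_component.items():
    print(f"{name}: r = {r:.2f}")
print("dominant component:", dominant)
```

The component with the largest absolute correlation is the one whose scale contributes most to the overall series in this toy setting.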


5 Conclusions 


In this study, MEMD analysis was used to investigate the multi-scale spatial relationships between
STN and influencing factors in Taiyuan Basin on the Chinese Loess Plateau. The overall
multivariate series of STN and influencing factors were decomposed into six IMFs and one residue
for each transect. The dominant scales of variance in STN were IMF1 at the upstream, IMF1
and IMF5 at the midstream, and IMF1, IMF5 and IMF6 at the downstream. Multi-scale spatial
correlation coefficients between STN and influencing factors were less significant at the
downstream than at the upstream and midstream, possibly because of the high variation in
topography. The prediction of STN at the sampling scale by summing all of the predicted IMFs and
residue was more accurate than that by SMLR directly. The leading processes influencing STN
contents in the entire basin occurred at IMF1 (scale around 1000 m). Sand, silt and PC1
were the main contributors to STN prediction at the upstream, sand and silt were the dominant
explanatory factors at the midstream, and elevation, BD and PC1 were the main factors at the
downstream of Taiyuan Basin. The results obtained from this study can be used to understand the
spatial variation and prediction of STN in the study area and in other similar regions of the world.